A multi-level methodology for the automated translation of a coreference resolution dataset: an application to the Italian language
Abstract
In the last decade, the demand for readily accessible corpora has touched all areas of natural language processing, including coreference resolution. However, it is one of the least considered sub-fields in recent developments. Moreover, almost all existing resources are only available for the English language. To overcome this lack, this work proposes a methodology to create a corpus for coreference resolution in Italian, exploiting knowledge annotated in other languages. Starting from OntoNotes, the methodology translates and refines English utterances to obtain utterances respecting Italian grammar, dealing with language-specific phenomena and preserving coreference and mentions. A quantitative and qualitative evaluation is performed to assess the well-formedness of the generated utterances, considering readability, grammaticality, and acceptability indexes. The results have confirmed the effectiveness of the methodology in generating a good dataset for coreference resolution starting from an existing one. The goodness of the dataset is also assessed by training a coreference resolution model based on the BERT language model, achieving promising results. Even if the methodology has been tailored for the English and Italian languages, it has a general basis easily extendable to other languages by adapting a small number of language-dependent rules to generalize most of the linguistic phenomena of the language under examination.
Similar resources
Developing a pattern based on speech acts and language functions for developing materials for the course “The study of Islamic texts translation”
The aim of the present study is to propose a pattern based on speech acts and language functions for developing materials for the course "The study of Islamic texts translation". In the new pattern, in order to develop better and more engaging materials, and unlike existing books, Austin's (1962) levels of speech acts, Searle's (1976) classification of speech acts, and Halliday's (1978) language functions were used. For this purpose, 57 verses were randomly selected from different parts of the Quran...
A synchronic and diachronic approach to the change route of address terms in the two recent centuries of Persian language
Terms of address, as important linguistic items, provide valuable information about the interlocutors, their relationship, and their circumstances. This study was done to investigate the change route of Persian address terms in the two recent centuries, including three historical periods: Qajar, Pahlavi, and after the Islamic Revolution. Data were extracted from a corpus consisting of 24 novels w...
Investigating the integration of translation technologies into translation programs in Iranian universities: basis for a syllabus design in translation technology
Today, information technology and computers are indispensable tools of any profession, and translation technologies have become an indispensable part of the translator's workstation. With the increasing demands for high productivity and speed as well as consistency, and with the rise of new demands for translation and localization, it is necessary for translators to be familiar with market demands an...
The underlying structure of language proficiency and the proficiency level
The aim of this study was to investigate the possible relationship between the level of foreign language proficiency and the structure of foreign language proficiency. A total of 314 female and male language learners, mostly undergraduate and graduate students of English, participated in the study. The participants differed widely in their level of foreign language proficiency (75 advanced, 113 intermediate, 126 elementary). Overall ...
Journal
Journal title: Neural Computing and Applications
Year: 2022
ISSN: 0941-0643, 1433-3058
DOI: https://doi.org/10.1007/s00521-022-07641-3